Abstract - In this work we study an Arbitrarily Varying Channel (AVC) with quadratic power constraints on the transmitter and a so-called "oblivious" jammer (along with additional AWGN) under a maximum probability of error criterion, and no private randomness between the transmitter and the receiver. This is in contrast to similar AVC models under the average probability of error criterion considered in [1], and models wherein common randomness is allowed [2]; these distinctions are important in some communication scenarios outlined below.

We consider the regime where the jammer's power constraint is smaller than the transmitter's power constraint (in the other regime it is known that no positive rate is possible). For this regime we show the existence of stochastic codes (with no common randomness between the transmitter and receiver) that enable reliable communication at the same rate as when the jammer is replaced with AWGN with the same power constraint. This matches known information-theoretic outer bounds. In addition to being a stronger result than that in [1] (enabling recovery of the results therein), our proof techniques are also somewhat more direct, and hence may be of independent interest.



I. Introduction 

Aerial Alice is flying in a surveillance plane high over Hostile Harry's territory. She wishes to relay her observations of Harry's troop movements back to Base-station Bob over $n$ channel uses of an AWGN channel with variance $\sigma^2$. Harry obviously wishes to jam Alice's transmissions. However, both Alice's transmission energy and Harry's jamming energy are constrained: they have access to energy sources of $nP$ and $nA$ Joules respectively.^1 Harry already knows what message Alice wants to transmit (after all, he knows the movements of his own troops), and also roughly how she'll transmit it (i.e., her communication protocol/code, having recently captured another surveillance drone), but he doesn't know exactly how she'll transmit it (i.e., her codeword; for instance, Alice could choose to focus her transmit power on some random subset of the $n$ channel uses). Further, since Alice's transmissions are very quick, Harry has no time to tune his jamming strategy to Alice's actual codeword; he can only jam based on his prior knowledge of Alice's code, and her message.^2

1 These are so-called peak power constraints: they must hold for all codewords, rather than averaged over all codewords (average power constraints). If the peak power constraints are relaxed to average power constraints, for either Alice's transmissions, or Harry's jamming (or both), it is known [2] that standard capacity results do not hold; only "$\lambda$-capacities" exist.

Even in such an adverse jamming setting we demonstrate that Alice can communicate with Bob at a rate equalling $\frac{1}{2}\log\left(1 + \frac{P}{A+\sigma^2}\right)$ as long as $P > A$. Note that this equals the capacity of an AWGN channel with noise parameter equal to $A + \sigma^2$; this means that no "smarter" jamming strategy exists for Harry than simply behaving like AWGN with variance $A$. If $P < A$ no positive rate is possible since Harry can "spoof" by transmitting a fake message using the same strategy as Alice, so that Bob is unable to distinguish between the real and fake transmissions.^3

A. Relationship with prior work 

The model considered in this work is essentially a special type of Arbitrarily Varying Channel (AVC) for which, to the best of our knowledge, the capacity has not been characterized before in the literature. The notion of AVCs was first introduced by Blackwell et al. [6], [7], to capture communication models wherein channels have unknown parameters that may vary arbitrarily during the transmission of a codeword. The case when both the transmitter and the jammer operate under constraints (analogous to the quadratic constraints in this work) has also been considered. For an extensive survey on AVCs the reader may refer to the excellent survey and the references therein.

The class of AVCs over discrete alphabets has been studied
in great detail in the literature. However, less is known
about AVCs with continuous alphabets. The bulk of the work 
on continuous alphabet AVCs (outlined below in this section) 
focuses on quadratically-constrained AVCs. This is also the 
focus of our work. 

It is important to stress several features of the model 
considered in this work, and the differences with prior work: 

• Stochastic encoding: To generate her codeword from her message, Alice is allowed to use private randomness (known only to her a priori, but not to Harry or Bob). This is in contrast to the deterministic encoding strategies often considered in the information theory/coding theory literature, wherein the codeword is a deterministic function of the message.

2 Alternatively, Alice could split her energy budget to concurrently transmit one symbol on $n$ different frequencies; these together could comprise her codeword. Given such a strategy, since Harry doesn't know Alice's codeword, he is unable to make his jamming strategy depend explicitly on the codeword Alice actually transmits.

3 Such a jamming strategy is equivalent to the more general symmetrizability condition in the AVC literature (see, for instance, [3], [4], and [5]).

• Public code: Everything Bob knows about Alice's transmission a priori, Harry also knows.^4 This is in contrast to the randomized encoding model also considered in the literature (see for instance [2], [9]), in which it is critical that Alice and Bob share common randomness that is unknown to Harry.

• Message-aware jamming: The jammer is already aware of Alice's message. This is one important difference between our model and the model in the work closest to ours, that of [1].

• Oblivious adversary: The jammer has no extra knowledge 
of the codeword being transmitted than what he has al- 
ready gleaned from his knowledge of Alice's code and her 
message. This is in contrast to the omniscient adversary 
often considered in the coding theory literature. 

These model assumptions are equivalent to requiring public stochastic codes with small maximum probability of error against an oblivious adversary. Several papers also operate under some of these assumptions, but as far as we know, none examines the scenario where all these constraints are active.

The literature on sphere packing focuses on an AVC model wherein zero-error probability of decoding is required (or, equivalently, when the probability (over Alice's codeword and Harry's jamming actions) of Bob's decoding error is required to equal zero). Inner and outer bounds were obtained by Blachman [10], [11]. Like several other zero-error communication problems (including Shannon's classic work [12]), characterization of the optimal throughput possible is challenging, and in general still an open problem.^5

Other related models include: 

• The vector Gaussian AVC [16]. As in the "usual" vector Gaussian channels, optimal code designs require "water-filling".

• The per-sequence/universal coding schemes in [17].

• The correlated/myopic jammers in [18], [19], wherein jammers obtain a noisy version of Alice's transmission and base their jamming strategy on this.

• The joint source-channel coding, and coding with feedback models considered by Başar [20], [21].

• Several other AVC variants, including dirty paper coding, 
in 
Table II
Examples of our notation convention for different variables.

Symbol                           Meaning
$\Psi(i)$                        Stochastic encoder applied to the message $i$
$\phi(\mathbf{Y})$               Deterministic decoder
$e(\mathbf{s}, i)$               Error probability (over the stochastic encoder and the channel noise) for a fixed message $i$ and jamming vector $\mathbf{s}$
$e_{\max}(\mathbf{s})$           Maximum (over messages) error probability for a fixed jamming vector $\mathbf{s}$
$\mathcal{N}(a, \sigma^2)$       Gaussian random variable with mean $a$ and variance $\sigma^2$
$\mathcal{B}_n(\mathbf{c}, r)$   A ball of radius $r$ in $\mathbb{R}^n$ which is centered at $\mathbf{c} \in \mathbb{R}^n$

Table III
Commonly used symbols.



II. Notation and Problem Statement 
A. Notation 

Throughout the paper, we use capital letters to denote random variables and random vectors, and corresponding lower-case letters to denote their realizations. Moreover, bold letters are reserved for vectors and calligraphic symbols denote sets. Random sets are represented by an extra star as superscript. Some constants are also denoted by capital letters. Our convention is summarized in Table II.

We use $\mathcal{N}(a, \sigma^2)$ to denote a Gaussian random variable with mean $a$ and variance $\sigma^2$. To denote a ball in an $n$-dimensional real space of radius $r$ which is centered at the point $\mathbf{c} \in \mathbb{R}^n$, we write $\mathcal{B}_n(\mathbf{c}, r)$. In Table III we summarize the notation used in this paper.



B. Problem Statement 

In this paper we study the capacity of a quadratic constrained 
AVC with stochastic encoder under the attack of a malicious 
adversary who knows the transmitted message but is oblivious 
to the actual transmitted codewords. 

Let the input and output of the channel be denoted by the random variables $X$ and $Y$ where $X, Y \in \mathbb{R}$. Then, formally, the channel is defined as follows

$$Y = X + S + V, \qquad (1)$$



We summarize some of the results mentioned above in Table I.



where $S \in \mathbb{R}$ is the channel state chosen by a malicious adversary and $V \sim \mathcal{N}(0, \sigma^2)$ is a Gaussian random variable. Here we assume that the noise $V$ is independent over different uses of the channel. The channel input is subjected to a peak power constraint as follows



4 This requirement is an analogue for communication of Kerckhoffs' Principle [8] in cryptography, which states that in a secure system, everything about the system is public knowledge, except possibly Alice's private randomness.

5 The literature on Spherical Codes (see [13], [14], and [15] for some relatively recent work) looks at the related problem of packing unit hyperspheres on the surface of a hypersphere. This corresponds to the design of codes where each codeword meets the quadratic power constraint with equality, rather than allowing for an inequality.
$$\|\mathbf{x}\|^2 = \sum_{i=1}^{n} x_i^2 \le nP, \qquad (2)$$

and the permissible state sequences are those satisfying

$$\|\mathbf{s}\|^2 = \sum_{i=1}^{n} s_i^2 \le nA. \qquad (3)$$
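The channel model (1) and the peak constraints (2) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the paper: the function names `satisfies_power` and `channel`, and the small numerical tolerance, are assumptions introduced here.

```python
import math
import random

def satisfies_power(vec, budget, tol=1e-9):
    """Check the peak constraint sum(v_i^2) <= n * budget, as in (2) and (3).

    `tol` is a hypothetical numerical slack, not part of the model.
    """
    n = len(vec)
    return sum(v * v for v in vec) <= n * budget + tol

def channel(x, s, sigma2, rng):
    """One block use of Y = X + S + V with V i.i.d. N(0, sigma2) per channel use."""
    if len(x) != len(s):
        raise ValueError("x and s must have the same block length")
    sigma = math.sqrt(sigma2)
    return [xi + si + rng.gauss(0.0, sigma) for xi, si in zip(x, s)]
```

For instance, with $n = 4$ and $P = 1$ the all-ones vector meets (2) with equality, and passing `sigma2 = 0.0` turns the channel into the noiseless sum of the codeword and the jamming vector.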
Table I
Comparison of existing results on quadratic-constrained AVCs with AWGN.

Figure 1. A power-constrained AVC with stochastic encoder. Here we assume that the adversary has access to the transmitted message $i$ but not to the transmitted codeword $\mathbf{x}^n(i, t)$.

The problem setup is depicted pictorially in Figure 1.

A code with stochastic encoder $(\Psi, \phi)$ of block-length $n$ consists of a set of encoders that are denoted by a random variable $\Psi : \{1, \ldots, M\} \to \mathbb{R}^n$ and a deterministic decoder $\phi : \mathbb{R}^n \to \{0, \ldots, M\}$, where $0$ denotes an error and $M = e^{nR}$ is the number of messages.^6 Each encoder $\psi$ is constructed by a set of codewords $\{\mathbf{x}_1, \ldots, \mathbf{x}_M\}$ from $\mathbb{R}^n$.

In this paper, we focus on the maximum probability of error. First, for a fixed jamming vector $\mathbf{s}$, let us define the probability of error given that the message $i$ has been sent as follows

$$e(\mathbf{s}, i) \triangleq \mathbb{P}_{\Psi, \mathbf{V}}\left[\phi(\Psi(i) + \mathbf{s} + \mathbf{V}) \ne i\right]. \qquad (4)$$

Then the maximum probability of error for a fixed $\mathbf{s}$ is defined by

$$e_{\max}(\mathbf{s}) \triangleq \max_{1 \le i \le M} e(\mathbf{s}, i). \qquad (5)$$

Now the capacity for the above channel can be stated as in Definition 1.

Definition 1. The capacity $C$ of an AVC with stochastic encoder under the quadratic transmit constraint $P$ and jamming constraint $A$ is the supremum over the set of real numbers such that for every $\delta > 0$ and sufficiently large $n$ there exist codes with stochastic encoder $(\Psi, \phi)$ that satisfy the following conditions. First, for the number of messages $M$ encoded by the code we have $M \ge \exp(n(C - \delta))$. Moreover, each codeword satisfies the quadratic constraint (2), and finally for the code we have

$$\lim_{n \to \infty} \ \sup_{\mathbf{s}:\ \|\mathbf{s}\|^2 \le nA} e_{\max}(\mathbf{s}) = 0.$$

6 For notational convenience we assume that $e^{nR}$ is an integer.



III. Main Results 

The main results of the paper are stated in Theorem 1 and its corollary.

Theorem 1. The capacity of a quadratic-constrained AVC under the maximum probability of error criterion with transmit constraint $P$ and jamming constraint $A$ and additive Gaussian noise of power $\sigma^2$ is given by

$$C = \begin{cases} \frac{1}{2}\log\left(1 + \frac{P}{A + \sigma^2}\right) & \text{if } P > A, \\ 0 & \text{otherwise.} \end{cases}$$

Remark 1. The result of Theorem 1 matches the result for stochastic encoders over discrete alphabets [23], [5, Theorem 7], in which it is shown that for the average probability of error criterion, using a stochastic encoder doesn't increase the capacity. Because the number of possible adversarial actions here is uncountably large, the technique of [23], which relies on taking a union bound over an at most exponential-sized set of possible adversarial actions, does not work.

Corollary 1. The capacity of a quadratic-constrained AVC under the maximum probability of error criterion with transmit constraint $P$ and jamming constraint $A$ is given by

$$C = \begin{cases} \frac{1}{2}\log\left(1 + \frac{P}{A}\right) & \text{if } P > A, \\ 0 & \text{otherwise.} \end{cases}$$
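As a quick numerical companion to Theorem 1 and Corollary 1, the capacity expression can be evaluated directly. The sketch below is illustrative only: the function name `avc_capacity` is hypothetical, the rate is reported in nats per channel use (consistent with $M = e^{nR}$), and the handling of the degenerate case $A + \sigma^2 = 0$ is an assumption made here for convenience.

```python
import math

def avc_capacity(P, A, sigma2=0.0):
    """Capacity in nats per channel use, following Theorem 1.

    With sigma2 = 0 this reduces to the noiseless expression of Corollary 1.
    """
    if P <= 0 or A < 0 or sigma2 < 0:
        raise ValueError("P must be positive; A and sigma2 must be non-negative")
    if P <= A:
        # The jammer can spoof a codeword, so no positive rate is possible.
        return 0.0
    if A + sigma2 == 0:
        # Assumption for this sketch: no jammer and no noise, unbounded rate.
        return math.inf
    return 0.5 * math.log(1.0 + P / (A + sigma2))
```

For example, `avc_capacity(3.0, 1.0)` evaluates $\frac{1}{2}\log 4$, while `avc_capacity(1.0, 2.0)` returns zero because $P \le A$.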

IV. Proof of Main Results 

In this section, we present the proof of Theorem 1 and its corollary. The proof of the converse part of Theorem 1 is stated in Section IV-B.

For the achievability part of Theorem 1 we claim that the same minimum distance decoder proposed in [1] to achieve the capacity for the average probability of error criterion, which is given by

$$\phi(\mathbf{y}) = \begin{cases} i & \text{if } \|\mathbf{y} - \mathbf{x}_i\|^2 < \|\mathbf{y} - \mathbf{x}_j\|^2 \text{ for } j \ne i, \\ 0 & \text{if no such } i:\ 1 \le i \le M \text{ exists,} \end{cases} \qquad (6)$$

also achieves the capacity for the maximum probability of error criterion.
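The decoding rule (6) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `min_distance_decode` and the list-of-lists representation of codewords are assumptions made here.

```python
def min_distance_decode(y, codewords):
    """Minimum distance decoder in the spirit of (6).

    Returns the 1-based index of the unique closest codeword,
    or 0 (an error) if the closest codeword is not unique.
    """
    dists = [sum((a - b) ** 2 for a, b in zip(y, x)) for x in codewords]
    best = min(dists)
    winners = [i for i, d in enumerate(dists, start=1) if d == best]
    return winners[0] if len(winners) == 1 else 0
```

With the two codewords `[1, 0]` and `[0, 1]`, a received vector `[0.9, 0.1]` decodes to message 1, whereas the equidistant point `[0.5, 0.5]` produces the error symbol 0.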

Note that in order to show that the supremum over $\mathbf{s}$ subject to (3) of $e_{\max}(\mathbf{s})$ goes to zero, it is sufficient to show that for every message $i$ the supremum over $\mathbf{s}$ subject to (3) of $e(\mathbf{s}, i)$ goes to zero.

To communicate, Alice (the transmitter) randomly picks a codebook $\mathcal{C}$ and fixes it. The codebook $\mathcal{C}$ comprises $e^{n(\delta_0 + R)}$ codewords $\mathbf{x}(i, t)$, $1 \le i \le e^{nR}$ and $1 \le t \le e^{n\delta_0}$, each chosen uniformly at random and independently from a sphere of radius $\sqrt{nP}$, as shown in Figure 2(a). Then, the $i$th row of the codebook, i.e., $\{\mathbf{x}(i, 1), \ldots, \mathbf{x}(i, e^{n\delta_0})\}$, is assigned to the $i$th message. In order to transmit the message $i$, the encoder randomly picks a codeword from the $i$th row of the codebook and sends it over the channel.

Figure 2. (a) The codebook is constructed such that for sending a message $i \in \{1, \ldots, e^{nR}\}$ the encoder chooses one of the $e^{n\delta_0}$ codewords randomly from the $i$th row of the above table. (b) Assuming that the codeword $\mathbf{x}(i, t)$ is sent, in our model an error occurs if the ML decoder declares $\mathbf{x}(j, t')$ for some $j \ne i$. Note that there is no error if the decoder declares another codeword from the $i$th row.
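The codebook construction and the stochastic encoder described above can be sketched in Python. This is a toy sketch under stated assumptions: the helper names (`sample_on_sphere`, `build_codebook`, `encode`, `decode_row`) are hypothetical, points on the sphere are drawn by normalizing a Gaussian vector, and the row sizes are small integers rather than $e^{nR}$ and $e^{n\delta_0}$.

```python
import math
import random

def sample_on_sphere(n, radius, rng):
    """Draw a point uniformly on the sphere of the given radius in R^n
    by normalizing an i.i.d. standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [radius * v / norm for v in g]

def build_codebook(n, num_messages, row_size, P, rng):
    """Rows index messages; each row holds `row_size` independent codewords
    on the sphere of radius sqrt(n * P)."""
    radius = math.sqrt(n * P)
    return [[sample_on_sphere(n, radius, rng) for _ in range(row_size)]
            for _ in range(num_messages)]

def encode(codebook, i, rng):
    """Stochastic encoder: pick a codeword uniformly from row i (1-based)."""
    return rng.choice(codebook[i - 1])

def decode_row(codebook, y):
    """Return the row (message) of the nearest codeword, or 0 if the nearest
    codewords of two different rows are exactly tied."""
    best_dist, best_row, tie = None, 0, False
    for row_index, row in enumerate(codebook, start=1):
        for x in row:
            d = sum((a - b) ** 2 for a, b in zip(y, x))
            if best_dist is None or d < best_dist:
                best_dist, best_row, tie = d, row_index, False
            elif d == best_dist and row_index != best_row:
                tie = True
    return 0 if tie else best_row
```

Note that `decode_row` only reports the row index, which reflects the convention of Figure 2(b): confusing the transmitted codeword with another codeword of the same row is not counted as an error.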

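Now, given that the message $i$ has been transmitted, the error probability $e(\mathbf{s}, i)$ of a stochastic code used over a quadratic-constrained AVC under the use of the minimum distance decoder (defined by (6)) equals

$$
\begin{aligned}
e(\mathbf{s}, i) &= \mathbb{P}_{\Psi, \mathbf{V}}\left[\phi(\Psi(i) + \mathbf{s} + \mathbf{V}) \ne i\right] \\
&= \mathbb{P}_{T, \mathbf{V}}\left[\|\mathbf{x}(i,T) + \mathbf{s} + \mathbf{V} - \mathbf{x}(j,t')\|^2 \le \|\mathbf{s} + \mathbf{V}\|^2 \text{ for some } j \ne i \text{ and } t'\right] \\
&= \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge nP + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right], \qquad (7)
\end{aligned}
$$

where $T$ is a uniformly distributed random variable defined over the set $\{1, \ldots, e^{n\delta_0}\}$. The last equality follows by expanding the squared norm and using $\|\mathbf{x}(i,T)\|^2 = \|\mathbf{x}(j,t')\|^2 = nP$. Figure 2(b) pictorially demonstrates the decoding errors at the decoder.

A. Achievability proof of Theorem 1

The main step in proving the achievability part of Theorem 1 consists in asserting the doubly exponential probability bound which is stated in Lemma 1.

Lemma 1. Let $\mathcal{C}^* = \{\mathbf{X}(i,t)\}$, in which $1 \le i \le \exp(nR)$ and $1 \le t \le \exp(n\delta_0)$, be a random codebook comprising independent random vectors $\mathbf{X}(i,t)$, each uniformly distributed on the $n$-dimensional sphere of radius $\sqrt{nP}$. First, fix a vector $\mathbf{s} \in \mathcal{B}_n(\mathbf{0}, \sqrt{nA})$. Then for every $\delta_0 > \delta_1 > 0$ and for sufficiently large $n$, if $R < \frac{1}{2}\log\left(1 + \frac{P}{A + \sigma^2}\right)$ we have

$$
\begin{aligned}
&\mathbb{P}_{\mathcal{C}^*}\Big[\mathbb{P}_{T, \mathbf{V}}\big[\langle \mathbf{X}(j,t'), \mathbf{X}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge nP + \langle \mathbf{X}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\big] > K e^{-n\delta_1}\Big] \\
&\qquad < \exp\big(-(K \log 2 - 10)\exp((\delta_0 - \delta_1) n)\big).
\end{aligned}
$$

Proof: For the proof refer to the appendix. ■

Lemma 2 (Quantizing Adversarial Vector). For a fixed jamming vector $\mathbf{s}$, for sufficiently small $\epsilon > 0$, and for every $\delta_0 > \delta_1 > 0$, there exists a codebook $\mathcal{C} = \{\mathbf{x}(i,t)\}$ of rate $R < \frac{1}{2}\log\left(1 + \frac{P}{A + \sigma^2}\right)$ comprising vectors $\mathbf{x}(i,t) \in \mathbb{R}^n$ of size $\sqrt{nP}$ with $1 \le i \le e^{nR}$ and $1 \le t \le e^{n\delta_0}$ which performs well over the AVC defined in Section II for all $\mathbf{s}' \in \mathcal{B}_n(\mathbf{s}, \epsilon)$, i.e., it satisfies

$$
\begin{aligned}
e(\mathbf{s}, i) &= \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge nP + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right] \\
&< K\exp(-n\delta_1) \qquad (8)
\end{aligned}
$$

with $\mathbf{s}$ replaced by $\mathbf{s}'$, for all $\mathbf{s}' \in \mathcal{B}_n(\mathbf{s}, \epsilon)$.

Proof: For a particular $\mathbf{s}$, instead of (8), let us assume that the code $\mathcal{C}$ satisfies a stronger condition

$$
\begin{aligned}
&\mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge nP - 2\epsilon\sqrt{nP} + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right] \\
&\qquad < K\exp(-n\delta_1). \qquad (9)
\end{aligned}
$$

Then it can be verified that for all $\mathbf{s}' \in \mathcal{B}_n(\mathbf{s}, \epsilon)$ the code $\mathcal{C}$ satisfies (8) where $\mathbf{s}$ is replaced by $\mathbf{s}'$. To show this let $\mathbf{s}' = \mathbf{s} + \rho\mathbf{u}$ where $\mathbf{u}$ is an arbitrary unit vector and $\rho \in [-\epsilon, \epsilon]$. Hence for all $\mathbf{s}' \in \mathcal{B}_n(\mathbf{s}, \epsilon)$ we can write

$$
\begin{aligned}
e(\mathbf{s}', i) &= \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s}' + \mathbf{V}\rangle \ge nP + \langle \mathbf{x}(i,T), \mathbf{s}' + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right] \\
&= \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle + \rho\langle \mathbf{x}(j,t'), \mathbf{u}\rangle \ge nP + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle + \rho\langle \mathbf{x}(i,T), \mathbf{u}\rangle \text{ for some } j \ne i \text{ and } t'\right] \\
&\le \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle + \epsilon\sqrt{nP} \ge nP + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle - \epsilon\sqrt{nP} \text{ for some } j \ne i \text{ and } t'\right] \\
&\overset{(a)}{<} K\exp(-n\delta_1),
\end{aligned}
$$

where the inequality in the third line holds because $|\rho\langle \mathbf{x}, \mathbf{u}\rangle| \le \epsilon\sqrt{nP}$ for every codeword $\mathbf{x}$, and (a) follows from (9).

Now, in Lemma 1 we can use the stronger error requirement (9) to show that there exists a code which satisfies (9). This stronger requirement results in a rate loss, but as $\epsilon$ goes to zero the rate loss vanishes. By the above argument, we know that this code satisfies (8) for all $\mathbf{s}' \in \mathcal{B}_n(\mathbf{s}, \epsilon)$ and we are done. ■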
Finally, Lemma 3 shows the existence of a good codebook for the quadratic-constrained AVC problem with stochastic encoder, which was introduced in Section II-B, and hence completes the proof of Theorem 1.

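Lemma 3 (Codebook Existence). For every $\delta_0 > \delta_1 > 0$ and $n \ge n_0(\delta_0, \delta_1)$ there exists a codebook $\mathcal{C} = \{\mathbf{x}(i,t)\}$ of rate $R < \frac{1}{2}\log\left(1 + \frac{P}{A + \sigma^2}\right)$ comprising vectors $\mathbf{x}(i,t) \in \mathbb{R}^n$ of size $\sqrt{nP}$ with $1 \le i \le e^{nR}$ and $1 \le t \le e^{n\delta_0}$ such that for every vector $\mathbf{s}$ satisfying (3) and every transmitted message $i$ we have

$$
\begin{aligned}
e(\mathbf{s}, i) &= \mathbb{P}_{T, \mathbf{V}}\left[\langle \mathbf{x}(j,t'), \mathbf{x}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge nP + \langle \mathbf{x}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right] \\
&< K\exp(-n\delta_1). \qquad (10)
\end{aligned}
$$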

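Proof: For any fixed codebook $\mathcal{C} = \{\mathbf{x}(i,t)\}$, let us make explicit the dependence of the error probability on $\mathcal{C}$ by defining $e_{\mathcal{C}}(\mathbf{s}, i) = e(\mathbf{s}, i)$. Then in order to prove the assertion of the lemma we can equivalently show that

$$\liminf_{n \to \infty} \mathbb{P}_{\mathcal{C}^*}\left[\forall \mathbf{s}, \forall i:\ e_{\mathcal{C}^*}(\mathbf{s}, i) < K e^{-n\delta_1}\right] > 0.$$

However, by using Lemma 2 it is not necessary to check for all $\mathbf{s}$ but only for those belonging to an $\epsilon$-net^7 $\chi_n$ that covers $\mathcal{B}_n(\mathbf{0}, \sqrt{nA})$.

Hence, we can write

$$
\begin{aligned}
\mathbb{P}_{\mathcal{C}^*}\left[\forall \mathbf{s} \in \chi_n, \forall i:\ e_{\mathcal{C}^*}(\mathbf{s}, i) < K e^{-n\delta_1}\right]
&= 1 - \mathbb{P}_{\mathcal{C}^*}\left[\exists \mathbf{s} \in \chi_n, \exists i:\ e_{\mathcal{C}^*}(\mathbf{s}, i) \ge K e^{-n\delta_1}\right] \\
&\overset{(a)}{\ge} 1 - \sum_{\mathbf{s} \in \chi_n} \sum_{i} \mathbb{P}_{\mathcal{C}^*}\left[e_{\mathcal{C}^*}(\mathbf{s}, i) \ge K e^{-n\delta_1}\right],
\end{aligned}
$$

where (a) follows from the union bound.

Now, note that to bound $|\chi_n|$ one might cover $\mathcal{B}_n(\mathbf{0}, \sqrt{nA})$ by a hypercube of edge size $2\sqrt{nA}$; see Figure 3. So we can write $|\chi_n| \le \left(\frac{2\sqrt{nA}}{\epsilon}\right)^n$. Then, by using Lemma 1 we have

$$\mathbb{P}_{\mathcal{C}^*}\left[\forall \mathbf{s} \in \chi_n, \forall i:\ e_{\mathcal{C}^*}(\mathbf{s}, i) < K e^{-n\delta_1}\right] \ge 1 - \left(\frac{2\sqrt{nA}}{\epsilon}\right)^n \times e^{nR} \times \exp\left(-K' e^{n(\delta_0 - \delta_1)}\right),$$

where $K' = K\log 2 - 10$ is the constant appearing in Lemma 1. Assuming $\delta_0 > \delta_1$, the right hand side goes to 1 as $n$ goes to infinity, because the doubly exponential term dominates the factor $\left(\frac{2\sqrt{nA}}{\epsilon}\right)^n e^{nR}$, and this completes the proof of the lemma. ■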
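To see why the doubly exponential term of Lemma 1 is needed in the union bound over the net, the logarithm of the failure bound can be evaluated numerically. The sketch below is illustrative only: the function name `log_union_bound` is hypothetical, the net-size estimate $(2\sqrt{nA}/\epsilon)^n$ is the hypercube covering discussed above, and all parameter values in the usage note are arbitrary choices, not values from the paper.

```python
import math

def log_union_bound(n, A, eps, R, K_prime, delta0, delta1):
    """Natural log of |net| * e^{nR} * exp(-K' * e^{n(delta0 - delta1)}).

    A negative value means the union bound on the failure probability is
    below 1; the bound is informative only once the doubly exponential
    term dominates. math.exp overflows for very large n * (delta0 - delta1).
    """
    if delta0 <= delta1:
        raise ValueError("the argument requires delta0 > delta1")
    log_net_size = n * math.log(2.0 * math.sqrt(n * A) / eps)
    return log_net_size + n * R - K_prime * math.exp(n * (delta0 - delta1))
```

With the arbitrary choices $A = 1$, $\epsilon = 0.01$, $R = 0.5$, $K' = 1$, $\delta_0 = 0.2$ and $\delta_1 = 0.1$, the bound is vacuous (positive logarithm) at $n = 10$ but strongly negative at $n = 200$, which matches the "sufficiently large $n$" qualifier in the lemma.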

B. Converse proof of Theorem 1

The converse of Theorem 1 follows by combining two different upper bounds on the capacity. The first bound follows by observing that if the randomness of the stochastic encoder is also shared with the decoder we can achieve higher rates. So by using the result of [2] for randomized codes^8 we have the following upper bound on the capacity of an AVC with stochastic encoder

$$C \le \frac{1}{2}\log\left(1 + \frac{P}{A + \sigma^2}\right).$$

7 An $\epsilon$-net is a set of points in a metric space such that each point of the space is within distance $\epsilon$ of some point in the set.

8 Similar to our work, [2] also considers the maximum probability of error criterion.

Figure 3. This figure shows how the whole sphere $\mathcal{B}_n(\mathbf{0}, \sqrt{nA})$ can be covered by $\epsilon$-dense subsets $\chi_n$. Here the set $\chi_n$ comprises points from a hyper-cubical lattice.

Now, it only remains to show that $C = 0$ for $P \le A$, where we use a symmetrizability argument similar to those in the AVC literature (also see [1]). To this end, we show that the adversary can fool the decoder and make it confused. Because $P \le A$, the adversary can use a stochastic encoder $\Psi'$ with the same probabilistic characteristics as $\Psi$, where we assume that $\Psi$ and $\Psi'$ are independent.^9 Then for any decoder $\phi$ and for any $i \ne j$ we can write

$$
\begin{aligned}
\mathbb{P}\left[\phi(\Psi(i) + \Psi'(j) + \mathbf{V}) \ne i\right] &= \mathbb{P}\left[\phi(\Psi(j) + \Psi'(i) + \mathbf{V}) \ne i\right] \\
&= 1 - \mathbb{P}\left[\phi(\Psi(j) + \Psi'(i) + \mathbf{V}) = i\right] \\
&\ge 1 - \mathbb{P}\left[\phi(\Psi(j) + \Psi'(i) + \mathbf{V}) \ne j\right].
\end{aligned}
$$

Hence we have

$$
\begin{aligned}
\frac{1}{M}\sum_{j=1}^{M} \frac{1}{M}\sum_{i=1}^{M} \mathbb{P}\left[\phi(\Psi(i) + \Psi'(j) + \mathbf{V}) \ne i\right]
&\ge \frac{1}{M^2}\sum_{i < j} \Big(\mathbb{P}\left[\phi(\Psi(i) + \Psi'(j) + \mathbf{V}) \ne i\right] + \mathbb{P}\left[\phi(\Psi(j) + \Psi'(i) + \mathbf{V}) \ne j\right]\Big) \\
&\ge \frac{1}{M^2} \cdot \frac{M(M-1)}{2} \ge \frac{1}{4},
\end{aligned}
$$

where $M = e^{nR}$. Since $e_{\max}(\mathbf{s}) \ge \frac{1}{M}\sum_{i=1}^{M} e(\mathbf{s}, i)$, this shows that

$$\frac{1}{M}\sum_{j=1}^{M} \mathbb{E}\left[e_{\max}(\Psi'(j))\right] \ge \frac{1}{4},$$

which means there exists at least one $k$ such that $\mathbb{E}\left[e_{\max}(\Psi'(k))\right] \ge \frac{1}{4}$, and this completes the proof.

9 Such a jamming strategy is equivalent to the notion of symmetrizability condition in the AVC literature (see, for instance, [3], [4], and [5]).

Appendix 

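Fact 1. For two events $\mathcal{A}$ and $\mathcal{B}$ we can write

$$\mathbb{P}[\mathcal{A}] = \mathbb{P}[\mathcal{A} \cap (\mathcal{B} \cup \mathcal{B}^c)] \le \mathbb{P}[\mathcal{B}] + \mathbb{P}[\mathcal{A} \cap \mathcal{B}^c].$$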

Our proof requires the following "martingale concentration lemma" proven in [1, Lemma A1].

Lemma 4 ([1, Lemma A1]). Let $X_1, \ldots, X_L$ be arbitrary r.v.'s and $f_i(X_1, \ldots, X_i)$ be arbitrary functions with $0 \le f_i \le 1$, $i = 1, \ldots, L$. Then the condition

$$\mathbb{E}\left[f_i(X_1, \ldots, X_i) \mid X_1, \ldots, X_{i-1}\right] \le a \quad \text{a.s.}, \quad i = 1, \ldots, L,$$

implies that

$$\mathbb{P}\left[\frac{1}{L}\sum_{i=1}^{L} f_i(X_1, \ldots, X_i) > \tau\right] \le \exp\left(-L(\tau \log 2 - a)\right).$$



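Lemma 5 ([1, Lemma 2]). Let the random vector $\mathbf{U}$ be uniformly distributed on the $n$-dimensional unit sphere. Then for every vector $\mathbf{u}$ on this sphere and any $0 < a < 1$, we have

$$\mathbb{P}\left[|\langle \mathbf{U}, \mathbf{u}\rangle| \ge a\right] \le 2(1 - a^2)^{\frac{n-1}{2}}.$$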
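The tail bound of Lemma 5 can be sanity-checked with a small Monte Carlo experiment. The following sketch is not from the paper: the function names `cap_bound` and `estimate_cap_probability` are hypothetical, the uniform point on the sphere is generated by normalizing a Gaussian vector, and the fixed unit vector is taken to be the first coordinate axis, which loses no generality by rotational symmetry.

```python
import math
import random

def cap_bound(n, a):
    """Right-hand side of Lemma 5: 2 * (1 - a^2)^((n - 1) / 2)."""
    return 2.0 * (1.0 - a * a) ** ((n - 1) / 2.0)

def estimate_cap_probability(n, a, trials, rng):
    """Monte Carlo estimate of P[|<U, u>| >= a] for U uniform on the unit
    sphere in R^n and u = e_1 (first coordinate axis)."""
    hits = 0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        if abs(g[0]) / norm >= a:
            hits += 1
    return hits / trials
```

For moderate dimensions the empirical frequency should sit below `cap_bound(n, a)`, and the gap illustrates that the bound is convenient rather than tight.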

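Proof of Lemma 1: For notational convenience let us normalize all vectors $\mathbf{s}$, $\mathbf{V}$, and $\mathbf{X}(i,t)$ by $1/\sqrt{n}$ in this proof.

To derive the doubly exponential bound stated in the lemma, we use Lemma 4. To this end let us define the functions $f_t$ for $1 \le t \le e^{n\delta_0}$ as follows

$$f_t(\mathbf{X}(i,1), \ldots, \mathbf{X}(i,t)) \triangleq \mathbb{P}_{\mathbf{V}}\left[\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\right].$$

Now, by using the functions $f_t$, the probability expression in the statement of the lemma can be written as follows

$$
\begin{aligned}
&\mathbb{P}_{\mathcal{C}^*}\Big[\mathbb{P}_{T, \mathbf{V}}\big[\langle \mathbf{X}(j,t'), \mathbf{X}(i,T) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,T), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\big] > K e^{-n\delta_1}\Big] \\
&= \mathbb{P}_{\mathcal{C}^*}\Big[e^{-n\delta_0}\sum_{t=1}^{e^{n\delta_0}} \mathbb{P}_{\mathbf{V}}\big[\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\big] > K e^{-n\delta_1}\Big] \\
&= \mathbb{P}_{\mathcal{C}^*}\Big[e^{-n\delta_0}\sum_{t=1}^{e^{n\delta_0}} f_t(\mathbf{X}(i,1), \ldots, \mathbf{X}(i,t)) > K e^{-n\delta_1}\Big]. \qquad (11)
\end{aligned}
$$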



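In order to bound (11) we use Lemma 4. To this end, we have to bound the expected values of the functions $f_t$. So we proceed as follows

$$
\begin{aligned}
&\mathbb{E}_{\mathcal{C}^*}\left[f_t(\mathbf{X}(i,1), \ldots, \mathbf{X}(i,t)) \mid \mathbf{X}(i,1), \ldots, \mathbf{X}(i,t-1)\right] \\
&= \mathbb{E}_{\mathcal{C}^*}\Big[\mathbb{P}_{\mathbf{V}}\big[\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s}\rangle + \langle \mathbf{X}(i,t), \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\big] \,\Big|\, \mathbf{X}(i,1), \ldots, \mathbf{X}(i,t-1)\Big] \\
&\overset{(a)}{=} \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle\big\}\Big] \\
&\overset{(b)}{\le} \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle < -\delta_2\right] \\
&\quad + \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle\big\},\ \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle \ge -\delta_2\Big], \qquad (12)
\end{aligned}
$$

where (a) follows because the $\mathbf{X}(i,t)$ are independent random variables so the conditioning can be removed, and also using the fact that for an event $\mathcal{A}$ we have $\mathbb{E}_{\mathcal{C}^*}\mathbb{P}_{\mathbf{V}}[\mathcal{A}] = \mathbb{P}_{\mathcal{C}^*}\mathbb{P}_{\mathbf{V}}[\mathcal{A}]$, and (b) follows from Fact 1.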

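Now, for $\delta_2 > 0$, by using Fact 1 we can bound the first term of (12) as follows

$$
\begin{aligned}
\mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle < -\delta_2\right]
&\le \mathbb{P}_{\mathbf{V}}\left[\|\mathbf{s} + \mathbf{V}\|^2 > \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&\quad + \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle < -\delta_2,\ \|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&\le \mathbb{P}_{\mathbf{V}}\left[\|\mathbf{s} + \mathbf{V}\|^2 > \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&\quad + \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[|\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle| > \delta_2,\ \|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right]. \qquad (13)
\end{aligned}
$$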



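First note that $\|\mathbf{s} + \mathbf{V}\|^2 = \|\mathbf{s}\|^2 + \|\mathbf{V}\|^2 + 2\langle \mathbf{s}, \mathbf{V}\rangle$. Then, since $\mathbf{V} = (V_1, \ldots, V_n)$ is a sequence of i.i.d. Gaussian random variables $\mathcal{N}(0, \frac{\sigma^2}{n})$, the first term of (13) can be bounded as follows

$$
\begin{aligned}
\mathbb{P}_{\mathbf{V}}\left[\|\mathbf{s} + \mathbf{V}\|^2 > \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right]
&= \mathbb{P}\left[\|\mathbf{V}\|^2 + 2\langle \mathbf{s}, \mathbf{V}\rangle > \sigma^2 + \delta_2\right] \\
&\overset{(a)}{\le} \mathbb{P}\left[\langle \mathbf{s}, \mathbf{V}\rangle > \eta\right] + \mathbb{P}\left[\|\mathbf{V}\|^2 + 2\eta > \sigma^2 + \delta_2\right] \\
&\overset{(b)}{=} \mathbb{P}\left[\langle \mathbf{u}, \mathbf{V}\rangle > \frac{\eta}{\|\mathbf{s}\|}\right] + \mathbb{P}\left[\|\mathbf{V}\|^2 > \sigma^2 + \delta_2 - 2\eta\right], \qquad (14)
\end{aligned}
$$

where (a) follows from Fact 1 for $\eta > 0$ and in (b) we define $\mathbf{u} = \mathbf{s}/\|\mathbf{s}\|$. Because $\mathbf{u}$ is a unit vector it is straightforward to show that $\langle \mathbf{u}, \mathbf{V}\rangle \sim \mathcal{N}(0, \frac{\sigma^2}{n})$. Hence the first term in (14) can be bounded as follows

$$\mathbb{P}_{\mathbf{V}}\left[\langle \mathbf{s}, \mathbf{V}\rangle > \eta\right] = Q\left(\frac{\sqrt{n}\,\eta}{\sigma\|\mathbf{s}\|}\right) \le \frac{1}{2}\exp\left(-\frac{\eta^2 n}{2\sigma^2 A}\right), \qquad (15)$$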



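where in the above equation we have used the bound $Q(y) \le \frac{1}{2}e^{-y^2/2}$ together with $\|\mathbf{s}\|^2 \le A$. In order to bound the second term in (14) note that $\frac{n}{\sigma^2}\|\mathbf{V}\|^2$ has the chi-squared distribution with $n$ degrees of freedom. Then by using [24, Lemma 1] we can bound the second term of (14) as follows

$$
\begin{aligned}
\mathbb{P}\left[\frac{1}{\sigma^2}\|\mathbf{V}\|^2 > 1 + \frac{\delta_2 - 2\eta}{\sigma^2}\right]
&\le \exp\left(-\frac{n}{2}\left[1 + \frac{\delta_2 - 2\eta}{\sigma^2} - \sqrt{1 + 2\,\frac{\delta_2 - 2\eta}{\sigma^2}}\right]\right) \\
&= \exp(-\xi n), \qquad (16)
\end{aligned}
$$

where $\xi = \frac{1}{2}\left[1 + \frac{\delta_2 - 2\eta}{\sigma^2} - \sqrt{1 + 2\,\frac{\delta_2 - 2\eta}{\sigma^2}}\right]$ is a positive quantity if $\delta_2 > 2\eta$.

Remark 2. Note that because $\sqrt{1 + 2x} \le 1 + x - \frac{x^2}{8}$ for $x \in [0, 1]$, then by choosing $x = \frac{\delta_2 - 2\eta}{\sigma^2}$ we have $\xi \ge \frac{x^2}{16}$ and $\exp(-\xi n) \le \exp\left(-\frac{n}{16}\left(\frac{\delta_2 - 2\eta}{\sigma^2}\right)^2\right)$.

Now it remains to bound the second term of (13). To this end let us write

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[|\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle| > \delta_2,\ \|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&= \int_{0}^{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2} \mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[|\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle| > \delta_2 \,\Big|\, \|\mathbf{s} + \mathbf{V}\|^2 = r\right] dF(r),
\end{aligned}
$$

where $F(r) = \mathbb{P}\left[\|\mathbf{s} + \mathbf{V}\|^2 \le r\right]$. Then we can write

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[|\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle| > \delta_2,\ \|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&= \int_{0}^{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2} \mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\left[|\langle \hat{\mathbf{X}}(i,t), \mathbf{U}\rangle| > \frac{\delta_2/\sqrt{P}}{\sqrt{r}} \,\Big|\, \|\mathbf{s} + \mathbf{V}\|^2 = r\right] dF(r) \\
&\overset{(a)}{\le} \mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\left[|\langle \hat{\mathbf{X}}(i,t), \mathbf{U}\rangle| > \frac{\delta_2/\sqrt{P}}{\sqrt{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}}\right],
\end{aligned}
$$

where $\mathbf{U} = \frac{\mathbf{s} + \mathbf{V}}{\|\mathbf{s} + \mathbf{V}\|}$, $\hat{\mathbf{X}}(i,t) = \frac{\mathbf{X}(i,t)}{\|\mathbf{X}(i,t)\|}$, and (a) is true because evaluating the term inside the integration at the point $r = \|\mathbf{s}\|^2 + \sigma^2 + \delta_2$ can only increase the probability term. Next, it follows that

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[|\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle| > \delta_2,\ \|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] \\
&\le \int \mathbb{P}_{\mathcal{C}^*}\left[|\langle \hat{\mathbf{X}}(i,t), \mathbf{u}\rangle| > \frac{\delta_2/\sqrt{P}}{\sqrt{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}} \,\Big|\, \mathbf{U} = \mathbf{u}\right] f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u} \\
&\overset{(a)}{\le} 2\left(1 - \frac{\delta_2^2/P}{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}\right)^{\frac{n-1}{2}} \\
&\overset{(b)}{\le} 2\exp\left(-\frac{n-1}{2} \cdot \frac{\delta_2^2/P}{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}\right), \qquad (17)
\end{aligned}
$$

where (a) follows from Lemma 5 and (b) follows from the inequality $1 - x \le e^{-x}$ for $0 \le x \le 1$. Finally, by combining (13), (14), (15), (16), and (17) we can bound the first term in (12) as follows

$$\mathbb{P}_{\mathbf{V}}\mathbb{P}_{\mathcal{C}^*}\left[\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle < -\delta_2\right] \le 2\exp\left(-\frac{n-1}{2} \cdot \frac{\delta_2^2/P}{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}\right) + e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}}. \qquad (18)$$

Now we bound the second term in (12) as follows. Suppose $\mathcal{A}$ denotes the event $\{\langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle \ge -\delta_2\}$ and let $\varphi = \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle$. Then for the second term of (12), we note that

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \varphi\big\},\ \mathcal{A}\Big] \\
&\overset{(a)}{\le} \mathbb{P}_{\mathbf{V}}\left[\|\mathbf{s} + \mathbf{V}\|^2 > \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\right] + \mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \varphi\big\},\ \mathcal{A},\ \mathcal{B}\Big] \\
&\overset{(b)}{\le} e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}} + \sum_{j \ne i,\, t'} \mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\left[\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \varphi,\ \mathcal{A},\ \mathcal{B}\right],
\end{aligned}
$$

where (a) follows from Fact 1 and we use $\mathcal{B}$ to denote the event $\{\|\mathbf{s} + \mathbf{V}\|^2 \le \|\mathbf{s}\|^2 + \sigma^2 + \delta_2\}$. The first two terms in (b) follow from (14), (15), and (16) while the third term is a result of the union bound. Let us define the unit vectors $\hat{\mathbf{X}}(j,t') = \frac{\mathbf{X}(j,t')}{\|\mathbf{X}(j,t')\|}$ and $\mathbf{U} = \frac{\mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}}{\|\mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\|}$. Then we note that

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \varphi\big\},\ \mathcal{A}\Big] \\
&\overset{(a)}{\le} e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}} + \sum_{j \ne i,\, t'} \mathbb{P}_{\mathbf{U}, \mathcal{C}^*}\left[\langle \hat{\mathbf{X}}(j,t'), \mathbf{U}\rangle \ge \frac{P + \varphi}{\sqrt{P}\sqrt{P + \|\mathbf{s} + \mathbf{V}\|^2 + 2\varphi}} \,\Big|\, \mathcal{A},\ \mathcal{B}\right] \\
&\overset{(b)}{\le} e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}} + \sum_{j \ne i,\, t'} \mathbb{P}_{\mathbf{U}, \mathcal{C}^*}\left[\langle \hat{\mathbf{X}}(j,t'), \mathbf{U}\rangle \ge \frac{P - \delta_2}{\sqrt{P}\sqrt{P + A + \sigma^2 + \delta_2 - 2\delta_2}}\right],
\end{aligned}
$$

where in (a) we use the fact that $\mathbb{P}[\mathcal{E}, \mathcal{A}, \mathcal{B}] \le \mathbb{P}[\mathcal{E} \mid \mathcal{A}, \mathcal{B}]$ and (b) follows because by substituting $\|\mathbf{s} + \mathbf{V}\|^2 = A + \sigma^2 + \delta_2$ and $\varphi = -\delta_2$ the probability term inside the summation in (a) can only increase; this implies that we can remove the conditioning with respect to events $\mathcal{A}$ and $\mathcal{B}$. Now, by applying Lemma 5 we can further bound the second term of (12) as follows

$$
\begin{aligned}
&\mathbb{P}_{\mathbf{V}, \mathcal{C}^*}\Big[\bigcup_{j \ne i,\, t'} \big\{\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \varphi\big\},\ \mathcal{A}\Big] \\
&\le e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}} + 2e^{n(R + \delta_0)}\left(1 - \frac{P^2 - \delta_2'}{P(P + A + \sigma^2 - \delta_2)}\right)^{\frac{n-1}{2}} \\
&\le e^{-n\xi} + \frac{1}{2}e^{-\frac{n\eta^2}{2\sigma^2 A}} + 2\exp\left(n(R + \delta_0) + \frac{n-1}{2}\log\left(1 - \frac{P - \delta_2'/P}{P + A + \sigma^2 - \delta_2}\right)\right), \qquad (19)
\end{aligned}
$$

where $\delta_2' = 2P\delta_2 - \delta_2^2$, so that $(P - \delta_2)^2 = P^2 - \delta_2'$.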

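Finally, by combining (18) and (19) we can write the following bound for the expectation of functions $f_t$

$$
\begin{aligned}
&\mathbb{E}_{\mathcal{C}^*}\left[f_t(\mathbf{X}(i,1), \ldots, \mathbf{X}(i,t)) \mid \mathbf{X}(i,1), \ldots, \mathbf{X}(i,t-1)\right] \\
&\le 2\exp\left(-\frac{n-1}{2} \cdot \frac{\delta_2^2/P}{\|\mathbf{s}\|^2 + \sigma^2 + \delta_2}\right) + 2e^{-n\xi} + e^{-\frac{n\eta^2}{2\sigma^2 A}} \\
&\quad + 2\exp\left(n(R + \delta_0) + \frac{n-1}{2}\log\left(1 - \frac{P - \delta_2'/P}{P + A + \sigma^2 - \delta_2}\right)\right).
\end{aligned}
$$

By making some more assumptions on $\delta_0$, $\delta_2$, $\eta$, $R$, and introducing $\delta_1$, we can simplify the upper bounds on the expected values of functions $f_t$ as follows

$$
\begin{aligned}
&\mathbb{E}_{\mathcal{C}^*}\left[f_t(\mathbf{X}(i,1), \ldots, \mathbf{X}(i,t)) \mid \mathbf{X}(i,1), \ldots, \mathbf{X}(i,t-1)\right] \\
&\overset{(a)}{\le} 2\exp\left(-\frac{n-1}{2} \cdot \frac{\delta_2^2}{2P(\|\mathbf{s}\|^2 + \sigma^2)}\right) + \exp\left(-\frac{n\eta^2}{2\sigma^2 A}\right) + 2\exp\left(-\frac{n}{16}\left(\frac{\delta_2 - 2\eta}{\sigma^2}\right)^2\right) + 2\exp(-n\delta_1) \\
&\overset{(b)}{\le} 2\exp(-n\delta_1) + \exp(-n\delta_1) + 2\exp(-n\delta_1) + 2\exp(-n\delta_1) \\
&\le 10\exp(-n\delta_1),
\end{aligned}
$$

where (a) follows by Remark 2, assuming $\delta_2 \le \|\mathbf{s}\|^2 + \sigma^2$, and choosing

$$R < \frac{1}{2}\log\left(1 + \frac{P - \delta_2'/P}{A + \sigma^2 - \delta_2 + \delta_2'/P}\right) - \delta_0 - \delta_1,$$

and (b) follows by assuming the conditions $\delta_2 \ge 2\eta + 4\sigma^2\sqrt{\delta_1}$, $\eta \ge \sqrt{2A}\,\sigma\sqrt{\delta_1}$, and $\delta_2 \ge \sqrt{4P(A + \sigma^2)\delta_1}$.

Then by applying Lemma 4 and choosing $a = 10e^{-n\delta_1}$ and $\tau = Ke^{-n\delta_1}$ we have

$$
\begin{aligned}
&\mathbb{P}_{\mathcal{C}^*}\Big[e^{-n\delta_0}\sum_{t=1}^{e^{n\delta_0}} \mathbb{P}_{\mathbf{V}}\big[\langle \mathbf{X}(j,t'), \mathbf{X}(i,t) + \mathbf{s} + \mathbf{V}\rangle \ge P + \langle \mathbf{X}(i,t), \mathbf{s} + \mathbf{V}\rangle \text{ for some } j \ne i \text{ and } t'\big] > Ke^{-n\delta_1}\Big] \\
&\le \exp\Big(-\exp(n\delta_0)\big(K\log 2\,\exp(-n\delta_1) - 10\exp(-n\delta_1)\big)\Big) \\
&= \exp\big(-(K\log 2 - 10)\exp(n(\delta_0 - \delta_1))\big).
\end{aligned}
$$

By assuming $\delta_0 > \delta_1 > 0$ we obtain the desired doubly exponential bound, hence we are done. ■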
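The final bound of Lemma 1 is only meaningful when $K\log 2 - 10 > 0$, i.e., when $K > 10/\log 2 \approx 14.43$. The short sketch below makes this explicit and evaluates the logarithm of the bound; the function name `lemma1_log_bound` and the parameter values in the usage note are illustrative assumptions, not part of the paper.

```python
import math

def lemma1_log_bound(n, K, delta0, delta1):
    """Natural log of exp(-(K log 2 - 10) * exp(n * (delta0 - delta1))).

    Raises ValueError when the coefficient K log 2 - 10 is not positive,
    since the bound is then no smaller than 1 and therefore vacuous.
    """
    coeff = K * math.log(2.0) - 10.0
    if coeff <= 0:
        raise ValueError("need K > 10 / log 2 for a non-trivial bound")
    if delta0 <= delta1:
        raise ValueError("need delta0 > delta1 for doubly exponential decay")
    return -coeff * math.exp(n * (delta0 - delta1))
```

For example, with $K = 20$, $\delta_0 = 0.2$ and $\delta_1 = 0.1$ the log-bound is already far below zero at $n = 50$ and decreases exponentially in $n$ thereafter.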